Site Reliability Engineer (SRE) | Onsite in Denver, CO
Site Reliability Engineer (SRE)
- Job Location : On-site, Denver, CO
- Experience : 9+ years
- Work Authorization : Open to All
- 8+ years of development and operations experience building and running production applications with uptime over 99%, or an equivalent combination of education, training, and related experience
- 3-5 years of experience as an SRE handling web-scale applications
- Strong hands-on coding experience in one or more programming languages such as Python, Golang, Java, Bash, etc.
- Good understanding of observability (monitoring, logging, tracing, metrics) and chaos engineering concepts.
- Proficiency in using the Application Performance Monitoring (APM) tool New Relic for monitoring, logging, and tracing.
- Expert-level hands-on knowledge of the public cloud platforms AWS and/or Google Cloud Platform.
- Professional-level certification on one of the public clouds is highly desirable.
- Must have hands-on experience in using configuration management systems such as Ansible or SaltStack and infrastructure automation tools like Terraform or CloudFormation.
- Should have used alerting systems such as PagerDuty.
- Should have implemented solutions around Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for services. Measurement should have covered both individual systems and interactions across systems in a distributed environment.
- Should have supported Production Incidents (PIs) on a company's critical applications: troubleshooting, debugging, and diagnosing operational issues and driving them to closure.
- Understanding of software delivery life cycles, particularly Agile/Lean & DevOps
- Proven experience in handling large-scale and growing infrastructure across data centers and heterogeneous cloud platforms
- Experience as a service owner managing large, geographically diverse stakeholder groups
- Ability to work with a creative, fast-growing engineering team and motivate them to deliver their best work
- History of driving innovation
- Bachelor’s/Master’s degree
Skills - Nice to Have:
- Familiarity with handling:
  - Containerization: Kubernetes, Docker, Rancher, etc.
  - Kafka, YARN, Elasticsearch, etc.
- Source code management and implementation of security best practices.
- Tech stack: Python, Falcon, Elasticsearch, MongoDB, AWS (SQS, S3), MapReduce
- Networking knowledge
- Contributions to the open source community
- Work with DevOps teams to build, release, monitor, and run the services to improve service reliability.
- Write software to automate API-driven tasks at scale and contribute to the product codebase in Java, JS, React, Node, Go, and Python.
- Write automation to reduce toil and eliminate repeatable manual tasks.
- Work with Ansible, Puppet, Chef, Terraform, or another configuration management / orchestration suite; know where it is broken, work towards fixing it, and explore new alternatives.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system reliability.
- Handle cross-team performance issues, from identifying the cause and determining the areas of improvement to driving those actions to closure.
- Baseline the performance and maturity of DevOps processes, tool maturity and coverage, metrics, technology, and engineering practices.
- Define, measure, and improve reliability metrics (SLO/SLI), observability (monitoring, logging, and tracing solutions), and ops processes (incident and problem management); streamline and automate release management.
- Build dashboards to provide visibility into application performance.
- Understand the current processes and system setup, and propose the process and technology improvements needed so that the application exceeds the desired Service Level Objective.
- Strong believer in automation as a way to sustain continuous improvement: automating toil and runbooks and improving the ability of applications to auto-heal, leading to improved reliability.
